Abstract

Objective: The purpose of the study was to examine the extent to which the School-wide 
Universal Behavior Sustainability Index: School Teams (SUBSIST; McIntosh, Doolittle, 
Vincent, Horner, & Ervin, 2009), a measure of school and district contextual factors that promote 
the sustainability of school practices, demonstrated measurement invariance across groups of 
schools that differed in length of time implementing School-wide Positive Behavioral 
Interventions and Supports (PBIS; Sugai & Horner, 2009), student ethnic composition, and 
student socio-economic status (SES). 

Method: School PBIS team members and district coaches representing 860 schools in 14 U.S. 
states completed the SUBSIST. 

Results: Findings supported strong measurement invariance, for all items except one, of a model 
with two school-level factors (School Priority and Team Use of Data) and two district-level 
factors (District Priority and Capacity Building) across groups of schools at initial 
implementation, institutionalization, and sustainability phases of PBIS implementation. Schools 
in the sustainability phase were rated significantly higher on School Priority and Team Use of 
Data than schools in initial implementation. Strong measurement invariance held across groups 
of schools that differed in student ethnicity and SES. 

Conclusions: The findings regarding measurement invariance are important for future 
longitudinal investigations of factors that may promote the sustained implementation of school 
practices. 

Keywords: positive behavior interventions and supports, sustainability, measurement invariance 

Measurement Invariance of an Instrument Assessing Sustainability 
of School-based Universal Behavior Practices 
The process of implementing new practices in schools and other agencies has been 
conceived as occurring in a set of predictable stages of implementation (Fixsen, Blase, Duda, 
Naoom, & Van Dyke, 2010). Adelman and Taylor (1997) identified these stages as (a) creating 
readiness, (b) initial implementation, (c) institutionalization, and (d) ongoing evolution, with 
creating readiness occurring before actual use of the intervention with the intended recipients, 
and initial implementation, institutionalization, and ongoing evolution occurring as the 
intervention is used. Across the literature, each of the various conceptualizations of these phases 
places sustainability as the final phase and ultimate goal of implementation (Adelman & Taylor, 
2003; Coburn, 2003; Fixsen, Naoom, Blase, Friedman, & Wallace, 2005). Assessing and 
predicting sustainability are important goals for research and practice because sustainability has positive 
effects not only on students (e.g., improved student functioning and long-term outcomes; Cook & 
Odom, 2013; Sanford DeRousie & Bierman, 2012), but also on systems and educators (e.g., 
improved teacher self-efficacy, organizational health; Baker, Gersten, Dimino, & Griffiths, 2004; 
Bradshaw, Koth, Bevans, Ialongo, & Leaf, 2008). The purpose of this study is to advance the 
assessment of school practice sustainability by investigating the measurement invariance of an 
established sustainability measure across schools at varied stages of PBIS implementation and 
with varied student ethnic composition and socioeconomic status (SES) levels. 

Conceptualization of Sustainability 

Sustainability is an elusive concept to measure because the construct is not as 
straightforward as is often considered (Vaughn, Klingner, & Hughes, 2000). Although 
sustainability has sometimes been viewed as synonymous with maintenance because it seems on 
the surface to be characterized as achieving the same result with the same practice, the actual 
process of sustainability involves iterative changes over time to make the practice more effective, 
efficient, and relevant to the context (Castro, Barrera, & Martinez, 2004; McIntosh, Filter, 
Bennett, Ryan, & Sugai, 2010; McLaughlin & Mitra, 2001). In addition, sustainability is to some 
extent an outcome in and of itself (i.e., continued adherence over time; Han & Weiss, 2005), but it 
can also be considered to be the potential for a practice to be sustained over time, a constellation 
of factors that make continued adherence more likely (McIntosh & Turri, in press). This latter 
conceptualization is more useful for practice because it allows for assessment, prediction, and 
systems-level interventions for sustainability at the start of implementation, rather than waiting 
until it has been achieved (Pluye, Potvin, Denis, Pelletier, & Mannoni, 2005). 

McIntosh and colleagues (2009) developed a model of sustainability of school-based 
practices that identifies hypothesized factors and mechanisms by which these practices can be 
sustained. For example, practice priority (including staff buy-in, administrator support, and 
integration into daily responsibilities) provides the stimulus to continue implementation, even 
when considering competing initiatives that are alternatives to the practice. Another factor, 
collection and use of data for decision making, is the mechanism by which school teams engage 
in ongoing evolution that results in adaptations that improve practices, rather than those that 
remove their effective components (McLaughlin & Mitra, 2001). 

Assessment of Sustainability 

To assess the contextual features most closely related to sustainability in the model, 
McIntosh and colleagues developed a measure, the School-wide Universal Behavior 
Sustainability Index: School Teams (SUBSIST; McIntosh, Doolittle, et al., 2009). The SUBSIST 
was developed as a self-administered measure of contextual and practice variables predicting 
implementation and sustainability of school-wide behavior support practices (e.g., programs, 
curricula) delivered to all students. Development included review of the items and response 
process by an expert panel and piloting with school teams (McIntosh et al., 2011), a large-scale 
study of perceived importance of items by practitioners (McIntosh et al., 2014), a cluster analysis 
and construct validation (Hume & McIntosh, in press), and large-scale factor analysis with 
prediction of fidelity of practice implementation (McIntosh et al., 2013). Although developed for 
use with any universal practice, to date the measure has been validated with School-wide 
Positive Behavioral Interventions and Supports (PBIS; Sugai & Horner, 2009), a framework for 
implementing school-based interventions to increase prosocial behavior and decrease problem 
behavior through environmental redesign, explicit instruction, acknowledgement of prosocial 
behavior, and team-based use of fidelity of implementation and student outcomes data for 
continuous improvement. This approach has been implemented in over 18,000 schools in the US 
and schools in over a dozen countries (Sugai, 2012, October). 

Despite this recent research validating the SUBSIST with schools adopting PBIS, 
additional validation is needed for the measure to contribute to important research in predicting 
and promoting sustainability. First, examining the measure’s factor structure in a larger sample 
would be helpful in determining the extent to which prior findings (McIntosh et al., 2013) are 
replicated in an independent, cross-validation sample. Additionally, because the definition of 
sustainability implies implementation over long periods of time (Lucyshyn et al., 2007), any 
measure assessing sustainability must have a consistent factor structure at differing periods of 
time of implementation. Of theoretical and practical interest, sustainability can be assessed for 
schools at different stages of implementation to identify whether the SUBSIST factor structure 
holds across initial implementation, institutionalization, and ongoing evolution. Theoretically, it 
would be expected that as schools continue with successful implementation and weather various 
barriers to sustainability, sustainability scores would increase over time, as has been shown with 
a smaller sample (Hume & McIntosh, in press). Prior to investigating factor score differences 
across stages of implementation, though, establishing measurement invariance of the SUBSIST 
is necessary to ensure that observed differences are not artifacts of measurement-related 
differences (Wu, Li, & Zumbo, 2007). 

Measurement invariance of the SUBSIST across stages of implementation may not hold 
because staff in schools with varying durations of PBIS implementation could interpret or value 
SUBSIST items differently. Prior research has indicated that school personnel rate perceptions 
that PBIS is part of systems already in use, integration into new initiatives, family engagement, 
and staff support as more important for schools in the sustainability phase than initial 
implementation (McIntosh et al., 2014). Because these particular variables are viewed as more 
important for schools in the sustainability phase, it is possible that items assessing these areas on 
the SUBSIST could be differentially related to the School Priority factor, resulting in 
non-invariance across stages of implementation. Without assessment of invariance based on stage of 
implementation, any observed differences in scale scores could be due to actual differences or 
variations in scale psychometrics across implementation stages (Wu et al., 2007). Similarly, in 
the context of future longitudinal research, it is important to establish measurement equivalence 
to ensure that differences observed over time are not measurement-related artifacts. 

In addition to investigating measurement invariance across groups with varying duration 
of PBIS implementation, it is important to consider the extent to which SUBSIST measurement 
invariance holds across schools with varied student populations. The impact of school ethnic 
composition and culture on the effectiveness of PBIS and the specific practices and interventions 
included in local implementations of the PBIS framework has received increased attention in the 
research literature (Sugai, O’Keeffe, & Fallon, 2012; Vincent & Tobin, 2011), and similar efforts 
could be observed in the area of sustainability of PBIS. Prior to investigating variability in 
SUBSIST scores across schools with varied student populations, however, it is important to 
investigate the extent to which measurement of factors related to sustainability is consistent 
across these schools. 

Purpose of the Study 

The purpose of the study was to cross-validate the initial factor structure of the SUBSIST 
in a larger, independent sample and assess measurement invariance across three periods of time, 
corresponding theoretically to three stages of implementation: initial implementation, 
institutionalization, and ongoing evolution. Invariance across time would provide both evidence 
of the measure’s psychometric adequacy for assessing longitudinal sustainability and 
insight into how sustainability changes based on the phase of implementation. In 
addition, measurement invariance across groups of schools that differed in student ethnic 
composition and SES was assessed. The specific research questions tested were as follows: 

1. To what extent is measurement with the SUBSIST invariant across three stages of 
practice implementation and schools with varied student ethnic composition and 
SES? 

2. Are there mean differences on the subscales of the SUBSIST across stages of 
implementation and groups of schools with varied student ethnic composition and 
SES? 


Method 


Participants and Settings 

School PBIS team members and district coaches representing a total of 860 schools 
implementing PBIS participated in the study. Of the 860 participants, 61% were school PBIS 
team leaders, 24% were school administrators, 9% were other faculty or staff on the school PBIS 
team, and 5% were external (e.g., district-level or regional) PBIS coaches. Of the 860 schools, 
212 schools (in 149 districts) were in year 0 (planning year) or year 1 (first year of 
implementation with students), representing the initial implementation stage of the model from 
Adelman and Taylor (1997). In addition, 410 schools (in 189 districts) were in years 2 to 4 of 
implementation, representing institutionalization. Finally, 238 schools (in 88 districts) had been 
implementing PBIS for 5 or more years, representing the ongoing evolution (sustainability) 
phase. The schools were located in 14 states and represented all 4 U.S. Census Bureau regions. 
National Center for Education Statistics (NCES) demographic data were available for 98% of 
schools and are presented by implementation stage group in Table 1. 

In addition to implementation stage, schools were divided into groups based on ethnic 
composition of students and school-level socioeconomic status (SES). For ethnic composition, 
the NCES Common Core of Data mean school-level percentage of students of color (i.e., 
non-White) was used to assign schools to groups: 534 schools had fewer than 45% students identified 
as non-White and 326 schools had 45% or more students identified as non-White. For SES, 
eligibility for federal Title I funds, an indicator of high numbers or percentages of students from 
low-SES families, was used to define groups: 335 schools were not eligible and 513 schools 
were eligible for Title I funds. Title I eligibility was unknown for 12 schools (1% of sample), and 
these schools were excluded from analyses of measurement invariance across school-level SES. 
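
The grouping rules described in this section can be summarized in a short sketch. The function names and group labels are hypothetical and are not taken from the study's analysis files; only the cut points (years 0 to 1, years 2 to 4, and 5 or more years; 45% non-White; Title I eligibility) and the reported group sizes come from the text.

```python
from typing import Optional


def implementation_stage(years_implementing: int) -> str:
    """Map years of PBIS implementation to the stage used for grouping."""
    if years_implementing < 0:
        raise ValueError("years_implementing must be non-negative")
    if years_implementing <= 1:   # year 0 (planning) or year 1
        return "initial implementation"
    if years_implementing <= 4:   # years 2 to 4
        return "institutionalization"
    return "sustainability"       # 5 or more years (ongoing evolution)


def ethnicity_group(pct_non_white: float) -> str:
    """Split schools at 45% of students identified as non-White."""
    return "45% or more non-White" if pct_non_white >= 45 else "fewer than 45% non-White"


def ses_group(title_i_eligible: Optional[bool]) -> Optional[str]:
    """Return None when eligibility is unknown (excluded from the SES analyses)."""
    if title_i_eligible is None:
        return None
    return "Title I eligible" if title_i_eligible else "not Title I eligible"


# Group sizes reported in the text; each grouping should account for all 860 schools.
stage_counts = {"initial implementation": 212, "institutionalization": 410, "sustainability": 238}
ethnicity_counts = {"fewer than 45% non-White": 534, "45% or more non-White": 326}
ses_counts = {"not Title I eligible": 335, "Title I eligible": 513, "unknown": 12}

print(sum(stage_counts.values()), sum(ethnicity_counts.values()), sum(ses_counts.values()))
```

The SES analyses therefore rest on 848 schools (860 minus the 12 with unknown eligibility), while the other two groupings use the full sample.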


Measure 

The School-wide Universal Behavior Sustainability Index: School Teams (SUBSIST; 
McIntosh, Doolittle, et al., 2009) is a 39-item measure of factors predicting sustained 
implementation of a school-based practice at a level of fidelity of implementation high enough to 
continue to meet valued outcomes. Respondents (school team members or external coaches) rate 
the extent to which each variable is present in their school at the time of response on a 4-point 
scale from 1 (not true) to 4 (very true). The measure includes school-level and district-level 
items. 

Evidence of the SUBSIST’s psychometric properties comes from three studies to date. 
Results of an expert panel assessment provided evidence of strong content validity (content 
validity index = .95), and a pilot study showed strong internal consistency (α = .87), interrater 
reliability (r = .95), and two-week test-retest reliability (r = .96; McIntosh et al., 2011). Results 
of a larger validation study (McIntosh et al., 2013) included an exploratory factor analysis and 
concurrent prediction of sustained PBIS implementation. Exploratory analyses indicated a 
four-factor structure, with two school-level factors [School Priority (20 items, α = .94) and Team Use 
of Data (11 items, α = .94)] and two district-level factors [District Priority (5 items, α = .71) and 
Capacity Building (3 items, α = .74)] representing elements of the practice and its context. 
SUBSIST items by subscale are presented in McIntosh et al. (2013). Results indicated strong 
concurrent validity with PBIS implementation, with statistically significant correlations between 
each factor and PBIS fidelity of implementation scores. A cluster analysis (Hume & McIntosh, in 
press) identified valid clusters based on use of data and statistically significant correlations with 
other indicators of sustainability, including number of years implementing, access to district 
coaching, and school team actions. 


Procedure 

After obtaining Institutional Review Board approval for the study, the authors worked 
with several state-level PBIS teams to recruit a large sample of schools at varying years of PBIS 
implementation to complete the SUBSIST. State PBIS teams recruited any schools implementing 
or preparing to implement PBIS to participate during existing PBIS training events (either initial 
or ongoing trainings) and through email contacts. Participation consisted of one member from 
each school PBIS team actively consenting to complete the SUBSIST through a secure, online 
survey program. 

Data Analyses 

Measurement invariance was investigated by fitting a series of multiple-group 
confirmatory factor analysis (CFA) models. Because items on the SUBSIST have too few 
response options and exhibited too much negative skew to be considered normally distributed 
(Lubke & Muthén, 2004), items were specified as ordinal indicators in the CFA models using the 
theta parameterization and the mean- and variance-corrected weighted least squares (WLSMV) 
estimator in Mplus 7 (Muthén & Muthén, 2012). In addition, because schools were nested in 
districts, standard errors and chi-square tests of model fit were adjusted to account for 
district-level clustering using the COMPLEX option in Mplus (Asparouhov, 2005). On several items 
(ones with only two estimated thresholds in Table 3 or two df in Table 4), the full range of 
response options was not used across all groups. For these items, responses on the lowest two 
options were combined. Individual SUBSIST items had an average of 6.4% missing data due to 
participants endorsing items as unknown or not applicable; all available item responses were 
analyzed using the WLSMV estimator (Asparouhov & Muthén, 2010) and allowed to inform 
parameter estimates. 

To investigate configural invariance, which is the extent to which items load on the same 
factor across groups, two sets of models were estimated. In the first set, fit of the 4-factor 
SUBSIST model in each group was investigated separately (e.g., initial implementation, 
institutionalization, and sustainability for analyses of invariance by implementation stage). In the 
second set, the fit of multi-group models was investigated, with factor loadings (excluding the 
first item in each factor that was constrained to equal one) and thresholds freely estimated in 
each group and all item residuals constrained to equal one and all factor means constrained to 
equal zero in all groups for model identification. In both sets of analyses, model fit was evaluated 
based on conventional criteria (Mueller & Hancock, 2010): Comparative Fit Index (CFI) > .95 
and Root Mean Square Error of Approximation (RMSEA) and its 90% confidence interval < .05. 

To test strong measurement invariance, the multi-group models from the tests of 
configural invariance were compared to models with loadings and thresholds constrained to be 
equal across groups, with all item residuals constrained to equal one and all factor means 
constrained to equal zero in one group for model identification. Model fit was compared using 
the likelihood ratio (LR) chi-square difference test calculated using the DIFFTEST option in 
Mplus. Following this global test of strong invariance, possible sources of non-invariance were 
explored using a backward selection procedure (Kim & Yoon, 2011). Specifically, we fit 
baseline models with factor loadings and item thresholds constrained to be equal across groups 
and then freed the factor loadings and thresholds across groups for one item at a time in 
comparison models. For model identification purposes, the residual variances of the item with 
freely estimated loadings and thresholds were constrained to equal one in all groups in the 
comparison models. 

In the LR tests comparing the fully-invariant baseline model to each of the comparison 
models, there is a greater likelihood of Type I errors due to both the number of tests and possible 
misfit in the baseline model (i.e., if one or more of the items constrained to be invariant are 
non-invariant; Stark, Chernyshenko, & Drasgow, 2006). To reduce the potential likelihood of Type I 
errors related to multiple tests, Bonferroni correction of the critical p value (.05/39 item tests = 
.001) for the LR test was used. To account for potential misspecification in the baseline model, 
the Bonferroni-adjusted critical value was adjusted further using the following equation (Oort, 
1998): 


\[
K_{\text{adjusted}} = \frac{\chi^2_0}{K + df_0 - df_{LR}} \cdot K \qquad (1)
\]

where $K$ and $df_{LR}$ are the Bonferroni-adjusted critical chi-square value and degrees of freedom 
for the LR test, respectively, and $\chi^2_0$ and $df_0$ are the chi-square value and degrees of freedom in 
the loading- and threshold-constrained baseline model. Kim and Yoon (2011) found that use of 
both Bonferroni and Oort (1998) adjustment of critical values of the LR test reduced false 
positives while maintaining adequate power in simulations of multiple-group ordinal CFA tests 
of measurement invariance. The combined Bonferroni and Oort (1998) adjustment resulted in 
chi-square critical values of 28.52 (df = 4) and 23.43 (df = 6) for the LR invariance tests across 
PBIS stage of implementation, 20.23 (df = 2) and 23.79 (df = 3) for tests across school-level 
ethnicity, and 20.39 (df = 2) and 23.99 (df = 3) for tests across school-level SES.* 
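
As an illustration of the adjustment, the following minimal sketch computes Oort-adjusted critical values as the baseline chi-square divided by (K + baseline df − LR-test df), multiplied by K, where K is the Bonferroni-adjusted critical value. It assumes the rounded critical p value of .001 stated in the text and the baseline values reported in the footnote for the ethnicity and SES models. The function names are hypothetical, and the chi-square quantile is obtained by simple bisection using only the standard library rather than a statistics package. Under these assumptions the results should agree, to two decimals, with the ethnicity and SES critical values reported above.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    total = sum(half ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi2_critical(alpha: float, df: int) -> float:
    """Critical value whose upper-tail area equals alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def oort_adjusted(k: float, df_lr: int, chi2_base: float, df_base: int) -> float:
    """Scale the critical value k by the misfit of the constrained baseline model."""
    return chi2_base / (k + df_base - df_lr) * k


ALPHA = 0.001  # assumption: rounded Bonferroni level from the text (.05 / 39 item tests)
baselines = {"ethnicity": (2212.05, 1499), "SES": (2228.54, 1498)}  # from the footnote

adjusted = {
    (name, df_lr): oort_adjusted(chi2_critical(ALPHA, df_lr), df_lr, chi2_0, df_0)
    for name, (chi2_0, df_0) in baselines.items()
    for df_lr in (2, 3)
}
for key, value in adjusted.items():
    print(key, round(value, 2))
```

Because the baseline chi-square exceeds its degrees of freedom in both models, the ratio is greater than one and the adjustment raises each critical value, which makes the item-level tests more conservative.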

* The following values were used in the calculation of the adjusted critical values: implementation stage, $\chi^2_0$ = 
3018.22, $df_0$ = 2292; ethnicity, $\chi^2_0$ = 2212.05, $df_0$ = 1499; and SES, $\chi^2_0$ = 2228.54, $df_0$ = 1498. 

Following the tests of measurement invariance across implementation, ethnic 
composition, and SES groups, differences in latent means on all factors of the SUBSIST were 
investigated in a partial or full measurement invariance model, based on results of the strong 
measurement invariance analyses, with latent means constrained to equal zero in one group for 
identification purposes. Standardized mean differences (d) were calculated as an effect size 
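measure using the following formula (Hancock, 2001): 

\[
d = \frac{\kappa_1 - \kappa_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}}} \qquad (2)
\]

where $\kappa_1$ and $\kappa_2$ are latent means in two groups, $n_1$ and $n_2$ are sample sizes in the groups, and 
$s_1^2$ and $s_2^2$ are variances of the latent factors. 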
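
A minimal sketch of this effect size follows. It assumes the standardized difference is the difference in latent means divided by the square root of the sample-size-weighted average of the two latent variances, which is the form suggested by the quantities defined above. The function name and all input values are hypothetical and are not taken from Table 5.

```python
import math


def latent_d(mean_1: float, mean_2: float, n_1: int, n_2: int,
             var_1: float, var_2: float) -> float:
    """Standardized latent mean difference with a sample-size-weighted pooled variance."""
    pooled_var = (n_1 * var_1 + n_2 * var_2) / (n_1 + n_2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


# Hypothetical inputs: the reference group's latent mean is fixed at zero for identification.
d_equal = latent_d(mean_1=0.5, mean_2=0.0, n_1=238, n_2=212, var_1=1.0, var_2=1.0)
d_unequal = latent_d(mean_1=0.5, mean_2=0.0, n_1=100, n_2=300, var_1=1.0, var_2=2.0)
print(round(d_equal, 3), round(d_unequal, 3))
```

When the two latent variances differ, the larger group's variance dominates the pooled term, so the same raw mean difference yields a smaller d if the larger group is also the more variable one.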

Results 

Results are organized in three main categories: tests of configural invariance, strong 
measurement invariance, and latent mean differences and factor correlations. 

Configural Invariance 

Fit indices for the models testing configural invariance are presented in Table 2. In 
general, model fit of the four-factor SUBSIST model was quite similar across stages of 
implementation (i.e., initial implementation, institutionalization, or sustainability), school student 
ethnic composition (i.e., < 45% non-White or ≥ 45% non-White), and school SES (i.e., eligible 
or not eligible for Title I). RMSEA values (range: .036 - .038) and the 90% CIs for RMSEA for the 
separate group models were within acceptable limits. CFI values (range: .940 - .950) were 
approximately equal to values suggesting strong model fit (CFI > .95; Mueller & Hancock, 
2010). Similarly, model fit of the multi-group models by stage of implementation, student ethnic 
distribution, and school SES supported configural invariance, with RMSEA values (range: .035 - 
.037) indicating acceptable fit and CFI values (range: .945 - .949) approximately equal to 
conventional criteria for acceptable fit. 

Strong Invariance 

To investigate strong measurement invariance, the fit of the multi-group configural 
invariance models was compared to models with factor loadings and item thresholds constrained 
to be equal across groups. Results are presented separately for analyses by implementation 
group, school ethnic composition, and school SES. 

Stage of PBIS Implementation. Across implementation groups, the configural model fit 
better than the strong invariance model, $\chi^2_{LR}$ = 291.71, df = 204, p < .001. Despite the statistically 
significant difference in chi-square between models, changes in CFI (.002) and RMSEA (.002) 
were negligible and suggested better fit for the strong invariance model. To further investigate 
potential sources of non-invariance, factor loadings and thresholds were freed one item at a time. 
Table 3 presents chi-square values for the individual LR tests. The LR chi-square values 
exceeded the Bonferroni- and Oort-adjusted critical values on only one item (School Priority: 
Item 17). For this item, threshold values decreased across the implementation groups, indicating 
that the boundaries between response options were at lower levels of latent School Priority as 
PBIS implementation time increased. This pattern could be related to the strong and increasing 
negative skew on the item across groups (initial implementation: -.42, institutionalization: -1.07, 
sustainability: -1.94). Fit for the final, partially-invariant model with loadings and thresholds free 
for the non-invariant item continued to be adequate on all indicators other than chi-square, $\chi^2$ = 
2995.13, df = 2286, p < .001, CFI = .952, RMSEA = .033, 90% RMSEA CI [.030, .036]. 
Standardized factor loadings and item thresholds for this model are presented in Table 3. 

School Ethnic Composition. Across schools with ≥ 45% and < 45% non-White students, 
the configural model fit better than the strong invariance model based on the LR test, $\chi^2_{LR}$ = 
170.09, df = 107, p < .001; however, both the CFI and RMSEA improved, albeit negligibly (.001 
and .002, respectively), in the strong invariance model. Chi-square values for the LR tests of 
models with loadings and thresholds freely estimated across groups relative to the strong 
invariance model are presented in Table 4. None of the LR chi-square values exceeded the 
Bonferroni- and Oort-adjusted critical values. Although the LR test of the configural vs. strong 
invariance model indicated that the configural model fit better, CFI and RMSEA values 
suggested that the strong invariance model fit better and strong invariance was supported in tests 
of individual SUBSIST items. Fit of the strong invariance model was acceptable on most 
indicators other than chi-square, $\chi^2$ = 2212.05, df = 1499, p < .001, CFI = .948, RMSEA = .033, 
90% RMSEA CI [.030, .036]. 

School-Level SES. Across schools that differed in Title I eligibility, there was no 
statistically significant reduction in model fit as factor loadings and item thresholds were 
constrained to equality in the strong invariance model, $\chi^2_{LR}$ = 117.22, df = 106, p = .215. In 
addition, none of the item-specific LR chi-square values exceeded the adjusted chi-square critical 
values, as presented in Table 4. Fit of the strong invariance model was acceptable on indicators 
other than chi-square, $\chi^2$ = 2228.54, df = 1498, p < .001, CFI = .950, RMSEA = .034, 90% 
RMSEA CI [.031, .037]. 
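
The RMSEA point estimates reported for the three final models can be approximately recovered from the reported chi-square values, degrees of freedom, and sample sizes. The sketch below assumes a multiple-group form of the index, the square root of the number of groups times the square root of max(chi-square − df, 0) / (df × N). The text does not state which formula the software applied, so this form is an assumption, and the function name is hypothetical. Sample sizes are taken from the Participants section (860 schools overall, 848 with known Title I eligibility).

```python
import math


def multigroup_rmsea(chi2: float, df: int, n_total: int, n_groups: int) -> float:
    """RMSEA with a sqrt(number of groups) scaling (assumed multiple-group form)."""
    excess = max(chi2 - df, 0.0)  # misfit beyond what is expected by chance
    return math.sqrt(n_groups) * math.sqrt(excess / (df * n_total))


# (chi-square, df, total N, number of groups) as reported in the text
models = {
    "implementation stage (partial invariance)": (2995.13, 2286, 860, 3),
    "ethnic composition (strong invariance)": (2212.05, 1499, 860, 2),
    "SES (strong invariance)": (2228.54, 1498, 848, 2),
}
rmsea = {name: multigroup_rmsea(*values) for name, values in models.items()}
for name, value in rmsea.items():
    print(name, round(value, 3), "meets < .05 criterion:", value < 0.05)
```

All three estimates fall below the .05 criterion, which illustrates why model fit was judged acceptable despite the statistically significant chi-square values in a sample of this size.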

Latent Variable Correlations and Means 

Correlations among the latent variables, factor means, and factor standard deviations in 
the partially-invariant implementation group model and the strong invariance ethnic composition 
and SES models are presented in Table 5. In general, all factors were positively and strongly 
correlated (rs = .53 to .85, p < .001) in all groups. Regarding the PBIS implementation model, 
none of the factor means in the institutionalization group significantly differed from means in the 
initial implementation group. In contrast, means for School Priority and Team Use of Data were 
higher (p = .008 and .001, respectively) in the sustainability group compared to the initial 
implementation group, and these differences were of small to medium magnitude (d = .36 and 
.52, respectively). In the school ethnic composition model, the mean of the Team Use of Data 
factor was lower in schools with 45% or more non-White students (p = .04), but the difference 
was of small magnitude (d = .24). In the school-level SES model, the mean of the School Priority 
factor was higher in schools eligible for Title I funds (p = .02), and the difference was also of 
small magnitude (d = .20). No other latent means differed statistically from reference group 
means in the models. 

Discussion 

Overall, results provided additional support for the two school- and two district-level 
factor structure of the SUBSIST in a sample independent from the one used in McIntosh et al. 
(2013). In addition, the results supported the four-factor solution across groups of schools that 
differed in duration of PBIS implementation, student ethnic composition, and student SES. 
Configural invariance of the SUBSIST was supported by adequate fit of the four-factor model in 
each group and the multi-group models. 

Results also supported strong measurement invariance across implementation groups for 
all items except one on the SUBSIST and for all items across school-level ethnicity and SES 
groups. Although change in approximate fit indexes from the configural to strong measurement 
invariance model was below the commonly used criterion of .01 for change in the CFI 
(Cheung & Rensvold, 2002), use of the backward selection procedure described in Kim and 
Yoon (2011) with Bonferroni and Oort (1998) adjustment of critical values for the LR test 
indicated that the item “SW-PBIS is considered to be a typical operating procedure of the school 
(it has become “what we do here/what we’ve always done”)” had different factor loadings and/or 
item thresholds across implementation groups. Logically, one would expect that schools that 
have been implementing PBIS longer would be more likely to perceive PBIS as a typical 
practice, and indeed, the item means increased across implementation groups (initial 
implementation: 2.96, institutionalization: 3.34, sustainability: 3.66) with more pronounced 
negative skew in groups with greater implementation time. In a related study with the original 
validation sample, this item was not perceived as important for initial PBIS implementation, but 
considering PBIS to be a typical operating procedure was rated by school and district PBIS team 
members as among the most important factors related to sustained implementation (McIntosh et 
al., 2014). Consequently, it is not surprising that as implementation time increased, it became 
easier for school teams (i.e., required less School Priority) to rate the item as more true of the 
school. 

Similar to the results of McIntosh et al. (2013), strong positive correlations were found 
among the four SUBSIST factors, and the magnitude of correlations appeared to be similar 
across implementation groups in the final, partially-invariant model that freed factor loadings 
and thresholds for the “typical operating procedure” item and across groups in the fully-invariant 
school-level ethnicity and SES models. Inspection of latent factor mean differences indicated no 
statistically significant differences between the initial implementation and institutionalization 
groups; however, levels of School Priority and Team Use of Data were greater, with small to 
medium size differences, in the sustainability compared to the initial implementation group. 
These findings are consistent with perceptions of school and district PBIS team members that 
school-level factors, particularly administrator priority, are more important than district-level 
factors for sustained implementation (McIntosh et al., 2013). In addition, the current finding that 
the largest mean differences were on Team Use of Data is consistent with prior research 
indicating that this factor had the largest independent association with PBIS implementation 
fidelity while accounting for the other factors on the SUBSIST (McIntosh et al., 2013). Small 
and statistically significant latent mean differences were also found in the school-level ethnicity 
and SES models. Schools with 45% or more students identified as non-White had lower scores 
on Team Use of Data compared to schools with fewer non-White students, and schools eligible 
for Title I funds had higher scores on School Priority than schools not eligible for funds. 
Although poverty is sometimes viewed as a strong barrier to sustainability because of reduced 
access to resources (Rogers, 2003), staff in schools serving more students in poverty may view 
preventive practices such as PBIS as more valuable and thus rate them as a higher priority. 

These findings should be considered in light of several limitations. First, the SUBSIST is 
a survey measure of perceptions completed by school or district PBIS team members, and it is 
possible that an individual respondent's perceptions are not representative of all team members 
in the school. This concern is partially tempered by findings of high inter-rater reliability for the 
SUBSIST in a prior investigation (McIntosh et al., 2011); however, results of the current study 
would be strengthened by inclusion of reports from multiple school team members and/or the 
addition of direct observation measures of factors related to sustainability. In addition, although 
groups that varied in terms of years of PBIS implementation were included in the study, the 
study was cross-sectional. Consequently, the extent to which longitudinal invariance would hold 
as the same schools continue to implement PBIS is unclear. 

The current finding of strong measurement invariance, excluding one item, across 
implementation groups for the SUBSIST is an important first step toward inclusion of the 
SUBSIST in a longitudinal investigation examining the dynamic interrelation of factors related 
to sustainability and fidelity of PBIS. Although Team Use of Data was found to be the most 
important factor in relation to implementation fidelity (McIntosh et al., 2013), the predictive 
power of the SUBSIST factors has not been examined prospectively. In addition to the predictive 
strength of each SUBSIST factor, future research should also examine the extent to which these 
associations change across stages of implementation. It is possible that the SUBSIST factors 
most predictive of sustainability during initial implementation differ from the factors predicting 
continued implementation after several years of institutionalization. We hope that by setting the 
stage for future longitudinal research, the current study will contribute to rigorous examination of 
factors that promote the sustained implementation of evidence-based school initiatives. 
Table 1 

School Demographics by Stage of Implementation 

                                            Stage
                    Initial Implementation   Institutionalization   Sustainability
                    (n = 209)                (n = 408)              (n = 233)
                    M or % (SD)              M or % (SD)            M or % (SD)
Enrollment          517.62 (375.84)          533.28 (344.24)        565.62 (299.66)
Title I Eligible    63% (48)                 64% (48)               52% (50)
% Non-White         35% (33)                 43% (32)               39% (25)
Grade Level
  Elementary        61%                      72%                    68%
  Middle            25%                      16%                    22%
  High              14%                      12%                    10%

Note. Data obtained from National Center for Education Statistics for 98% of schools. 
Table 2 

Model Fit in Tests of Configural Invariance 

Model       Group                     χ²         df     CFI    RMSEA   90% RMSEA CI
Stage       Initial Implementation     894.69     696   .940   .037    .029 - .044
            Institutionalization      1075.90     696   .949   .036    .032 - .041
            Sustainability             913.91     696   .942   .036    .029 - .043
            Combined                  2840.79    2088   .949   .035    .032 - .039
Ethnicity   <45% non-White            1303.57     696   .941   .040    .037 - .044
            ≥45% non-White             953.58     696   .950   .034    .028 - .039
            Combined                  2128.33    1392   .947   .035    .032 - .038
Title I     Eligible                  1208.01     696   .946   .038    .034 - .041
            Not Eligible              1020.33     696   .942   .037    .032 - .042
            Combined                  2189.09    1392   .945   .037    .034 - .040

Note. n = 860 for analyses by implementation stage, n = 860 for analyses by ethnicity, and n = 
848 for analyses by school Title I eligibility. 
Table 3 

Factor Loadings and Thresholds for the Partially Invariant Model and Likelihood Ratio Tests of 
Measurement Invariance Across Stage of Implementation 

Level      Factor              Item   Loading   Threshold 1   Threshold 2   Threshold 3   χ²LR     dfLR   p
School     School Priority     1      A1        -1.67         -0.77         —              1.92    4      .751
                               2      .58       -1.77         -0.38         —              5.82    4      .213
                               3      .64       -2.51         -1.59         -0.45          9.41    6      .152
                               4      .62       -2.74         -1.58         -0.14          3.51    6      .743
                               5      .51       -0.50          0.86          1.80          3.58    6      .733
                               6      .67       -2.01         -0.77         —             16.35    4      .003
                               7      .64       -2.78         -1.86         -0.65         11.34    6      .079
                               8      .37       -2.19         -1.55         -0.76          8.08    6      .232
                               9      .59       -2.31         -0.93         —              7.75    4      .101
                               10     .70       -3.00         -1.49          0.51         15.11    6      .019
                               11     .69       -2.17         -1.34         -0.17         10.85    6      .093
                               12     .45       -1.76         -0.81          0.35         14.67    6      .023
                               13     .70       -3.61         -2.44         -0.60         11.58    6      .072
                               14     .63       -2.21         -1.01          0.05         14.26    6      .027
                               15     .76       -2.86         -1.56          0.38          3.44    6      .752
                               16     .62       -2.68         -0.97         —              4.73    4      .316
                               17ᵃ    .80       -2.60         -0.86          0.77         64.95*   6      .000
                                      .73       -2.73         -1.51         -0.06
                                      .80       -3.26         -2.27         -0.52
                               18     .57       -2.87         -1.50          0.24         13.48    6      .036
                               19     .55       -2.44         -1.30          0.20          4.36    6      .628
                               20     .59       -2.30         -0.59         —              1.70    4      .791
           Team Use of Data    1      .64       -2.97         -1.59          0.04          9.84    6      .131
                               2      .72       -1.79          0.09         —             23.29    4      .000
                               3      .51       -2.32         -1.63         -0.86         19.21    6      .004
                               4      .52       -2.13         -1.12         -0.22          7.70    6      .261
                               5      .63       -2.24         -1.31         -0.24          5.86    6      .439
                               6      .67       -2.37         -1.29          0.07          7.38    6      .287
                               7      .84       -3.06         -1.49          0.21         15.06    6      .020
                               8      .80       -2.18         -0.80          0.41          8.80    6      .185
                               9      .84       -1.79         -0.68          0.73         23.12    6      .001
                               10     .88       -4.31         -2.16         -0.03         13.51    6      .036
                               11     .59       -2.19         -1.34          0.04          8.52    6      .202
District   District Priority   1      .55       -1.38         -0.38          0.93          9.06    6      .170
                               2      .69       -2.93         -1.39         -0.06          6.86    6      .334
                               3      .55       -1.83         -0.87          0.15          9.75    6      .135
                               4      .79       -2.24         -0.53          1.30         10.59    6      .102
                               5      .71       -2.46         -1.14          0.02         18.84    6      .004
           Capacity Building   1      .60       -2.19         -1.17         -0.27          2.35    6      .885
                               2      .78       -3.04         -1.75         -0.40          8.45    6      .207
                               3      .62       -1.79         -0.80          0.13          9.23    6      .161

Note. n = 860. χ²LR = likelihood ratio chi-square, dfLR = degrees of freedom for likelihood ratio 
test. Presented factor loadings are standardized. 
ᵃGroup-specific loadings and thresholds are presented for this item. 
*Chi-square value exceeds Bonferroni- and Oort-adjusted critical value. 
Table 4 

Likelihood Ratio Tests of Measurement Invariance by School-Level Ethnicity and SES 

                                             Ethnicity               Title I
Level      Factor              Item   χ²LR    dfLR   p        χ²LR    dfLR   p
School     School Priority     1      10.67   2      .005      7.60   2      .022
                               2       2.38   2      .305       .64   2      nil
                               3       2.79   3      .425     11.75   3      .008
                               4       1.39   3      .707      3.60   3      .308
                               5       4.66   3      .199      1.95   3      .583
                               6       3.62   3      .306      2.01   3      .570
                               7       1.77   3      .623      1.47   3      .689
                               8      11.90   3      .008      4.19   3      .242
                               9       3.34   3      .343      3.31   3      .346
                               10      7.47   3      .058      2.07   3      .559
                               11      2.18   3      .535      4.89   3      .180
                               12      7.28   3      .063      1.02   3      .797
                               13      4.52   3      .210      2.55   3      .467
                               14     11.84   3      .008      3.02   3      .388
                               15      5.80   3      .122      8.22   3      .042
                               16      1.98   3      .578      1.40   2      .496
                               17      3.78   3      .286      3.76   3      .288
                               18      3.08   3      .380      1.05   3      .790
                               19     19.10   3      .000      6.22   3      .101
                               20     12.82   3      .005      2.58   3      .461
           Team Use of Data    1       2.39   3      .495      4.21   3      .239
                               2       1.25   3      .740      8.59   3      .035
                               3       2.31   3      .512      6.25   3      .100
                               4       2.12   3      .548      3.18   3      .365
                               5       1.54   3      .674      2.33   3      .507
                               6       8.37   3      .039      2.11   3      .549
                               7       1.84   3      .606      1.65   3      .649
                               8       2.68   3      .444      2.66   3      .447
                               9       5.85   3      .119      2.05   3      .562
                               10      3.71   3      .294      3.43   3      .331
                               11      2.81   3      .422      1.37   3      .713
District   District Priority   1       4.30   3      .231      3.06   3      .382
                               2        .70   3      .873      1.84   3      .606
                               3      15.91   3      .001      4.25   3      .236
                               4       5.41   3      .144      4.66   3      .198
                               5       2.63   3      .452      5.56   3      .135
           Capacity Building   1      13.60   3      .004      3.50   3      .321
                               2      10.94   3      .012      9.01   3      .029
                               3        .94   3      .816      2.99   3      .393

Note. n = 860 for analyses by ethnicity, and n = 848 for analyses by school Title I eligibility. 
χ²LR = likelihood ratio chi-square, dfLR = degrees of freedom for likelihood ratio test. 
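
Readers who wish to check the p values reported alongside the likelihood ratio tests in Tables 3 and 4 can recompute them from each chi-square statistic and its degrees of freedom. The following is a minimal Python sketch under the assumption that each reported p value is the ordinary upper-tail chi-square probability. It does not reproduce the Bonferroni or Oort adjustments to the critical value, and the function name chi2_sf is hypothetical rather than taken from the software used in the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite series
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


# A few (chi-square, df) pairs copied from Tables 3 and 4 for illustration.
examples = [
    ("Table 3, School Priority item 3", 9.41, 6),
    ("Table 3, School Priority item 6", 16.35, 4),
    ("Table 4, School Priority item 1 (ethnicity)", 10.67, 2),
    ("Table 4, School Priority item 3 (Title I)", 11.75, 3),
]
for label, chi_square, df in examples:
    print(f"{label}: p = {chi2_sf(chi_square, df):.3f}")
```

Because the tabled chi-square values are rounded to two decimals, recomputed p values may differ from the reported ones in the third decimal place.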
Table 5 

Factor Correlations, Standard Deviations, and I 

Group                     Factor              SP       TUD      DP       M        SD
Initial Implementation    School Priority     —                          .00ᵃ     .53
                          Team Use of Data    .78***   —                 .00ᵃ     .84
                          District Priority   .77      .64***   —        .00ᵃ     .65
                          Capacity Building   .61***   .66***   .72      .00ᵃ     .75
Institutionalization      School Priority     —                          .01      .56
                          Team Use of Data    .80***   —                 .13     1.04
                          District Priority   .70***   .62***   —        .02      .69
                          Capacity Building   .69***   .75***   .74***  -.04      .81
Sustainability            School Priority     —                          .20      .58
                          Team Use of Data    .85***   —                 .47      .97
                          District Priority   .62***   .53***   —        .07      .65
                          Capacity Building   .58***   .64***   .62***   .01      .76
<45% non-White            School Priority     —                          .00ᵃ     .67
                          Team Use of Data    .81***   —                 .00ᵃ    1.15
                          District Priority   .72***   .64***   —        .00ᵃ     .70
                          Capacity Building   .62***   .75***   .69***   .00ᵃ     .90
≥45% non-White            School Priority     —                         -.04      .75
                          Team Use of Data    .84***   —                -.28*    1.23
                          District Priority   .67***   .61***   —        .06      .68
                          Capacity Building   .68***   .66***   .77     -.11      .88
Title I Not Eligible      School Priority     —                          .00ᵃ     .58
                          Team Use of Data    .79***   —                 .00ᵃ    1.01
                          District Priority   .69***   .57***   —        .00ᵃ     .68
                          Capacity Building            .76***   .77      .00ᵃ     .84
Title I Eligible          School Priority     —                          .11*     .56
                          Team Use of Data    .83***   —                 .12      .98
                          District Priority   .68***   .62***   —        .10      .66
                          Capacity Building   .60***   .67***   .68***   .10      .94

Note. SP = School Priority, TUD = Team Use of Data, DP = District Priority. 
ᵃParameter constrained for model identification. 
*p < .05, **p < .01, ***p < .001 



